# Read the Solution [Here](https://quastor.org/cracking-the-coding-interview/moderate/word-frequencies)

# Word Frequencies

## Cracking the Coding Interview (CTCI) 16.2

<br />

## Question

Design a method that finds the frequency of occurances of any given word in a book. How would you change the method if we had to run it multiple times for different query words?

<br />

<details>
  <summary>Single Query Solution</summary>

  We'll assume that the book is a list of words.

  <br />

  If we want to check for a word and we're only running the method once, the most efficient way is to just iterate through the list of words and see if any of them are equivalent to the word we're looking for.

  <br />

  We'll assume that capitalization doesn't matter, but this is something that you should check with your interviewer.

```python
import string  # Importing standard library is okay during interview

def getWordFrequency(book: List[str], word: str):
    count = 0
    word = word.trim().lower()
    for w in book:
        if word == w.trim().lower():
            count += 1
    
    return count
  ```

  This code runs in $$O(n)$$ time and takes $$O(1)$$ space.

</details>

<br />

<details>
  <summary>Multiple Query Solution</summary>

  The previous solution works great when you're only calling the method a few times.

  <br />

  However, as you start calling the method many times for different words, your method will have to search the entire book on each call.

  <br />

  Now it makes sense to do some preprocessing on the book so that each call to the method can be done faster.

  <br />

  We can create a data structure that keeps track of the number of occurrences of every word in our book. This can be done with a hash table (or dictionary).

  <br />

  This way, whenever we want to query the occurrence of a specific word, we can just search our hash table and return the answer in constant time.

  <br />

  Creating the data structure is quite simple. We create our dictionary and then iterate through the book string. If a word is already in our dictionary, then we update it's count by 1. Otherwise, we add it to the dictionary with a count of one.

  <br />

  Then, our `getWordFrequency` function will just query the dictionary for the number of occurrences of the given word and return the answer in $$O(1)$$ time.

```python
class wordFrequency:
    def __init__(self, book):
        self.wordCounts = self.setupDictionary(book)
    
    def setupDictionary(self, book):
        wordCounts = {}
        for w in book:
            w = w.trim().lower()
            if w == "":
                continue
            if w in wordCounts:
                wordCounts[w] += 1
            else:
                wordCounts[w] = 1
        return wordCounts
    
    def getWordFrequency(self, word):
        word = word.lower()
        if word in self.wordCounts:
            return self.wordCounts[word]
        
        return 0
```
</details>

<br />

<details>
  <summary>Further Enhancement</summary>


 Currently, the word counts for our books are stored in volatile memory. If our program crashes or we shut off the computer, we'll lose all the word counts.

  <br />

  You can fix this by using a key-value database like Redis. Redis is in-memory (so reads and writes are quite fast) and can be configured to be persistent so it writes to disk at regular intervals.

  <br />

  That way, you won't have to re-index your books if your program crashes.

  </details>
